Text Data Sources

Author

Luisa M. Mimmi

Published

May 10, 2024

Work in progress

—————————————————————————-

A. World Development Reports (WDRs)

—————————————————————————-

B. World Bank Projects & Operations:

Connections

Following David Robinson's analysis of Hacker News (HN) titles (Robinson 2017)
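To make the idea concrete, here is a minimal sketch of a "words growing or shrinking in titles" comparison, applied to project titles. It is not a reproduction of the analysis in the cited post: it only compares the share of titles containing each word in two years. The sample rows, the stop-word list, and the function names (`tokenize`, `word_shares`, `share_change`) are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample records: (year, title). These rows are invented;
# real input would come from a projects export with a date and a title column.
PROJECTS = [
    (2001, "Rural Water Supply Project"),
    (2001, "Rural Roads Improvement Project"),
    (2001, "Water Sector Reform"),
    (2020, "Climate Resilience Project"),
    (2020, "Digital Economy and Climate Adaptation"),
    (2020, "Urban Water Resilience"),
]

# Assumed, deliberately tiny stop-word list (illustrative only).
STOP_WORDS = {"and", "the", "of", "for", "in", "project"}


def tokenize(title):
    """Lowercase a title and return its alphabetic tokens, minus stop words."""
    return [w for w in re.findall(r"[a-z]+", title.lower()) if w not in STOP_WORDS]


def word_shares(records):
    """Return {year: {word: share of that year's titles containing the word}}."""
    titles_per_year = Counter()
    hits = defaultdict(Counter)
    for year, title in records:
        titles_per_year[year] += 1
        # set() so a word repeated within one title is counted once
        for word in set(tokenize(title)):
            hits[year][word] += 1
    return {
        year: {word: n / titles_per_year[year] for word, n in counts.items()}
        for year, counts in hits.items()
    }


def share_change(records, start, end):
    """Change in title share per word from `start` to `end`, largest gain first."""
    shares = word_shares(records)
    first, last = shares.get(start, {}), shares.get(end, {})
    change = {w: last.get(w, 0.0) - first.get(w, 0.0) for w in set(first) | set(last)}
    # Sort by descending change, then alphabetically to keep ties stable
    return sorted(change.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    for word, delta in share_change(PROJECTS, 2001, 2020):
        print(f"{word:12s} {delta:+.2f}")
```

Comparing shares of titles rather than raw counts keeps years with many projects from dominating. A fuller analysis would use every period and fit a trend per word rather than comparing two endpoints.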

Data sources:

  1. World Bank Projects & Operations: https://datacatalog.worldbank.org/search/dataset/0037800
  2. World Bank - World Development Reports
  • Accessibility Classification:

Reference Tutorials

(Robinson and Silge 2022) (LDAL 2022) (edureka! 2019)

Benjamin Soltoff: Computing 4 Social Sciences - API

Benjamin Soltoff: Computing 4 Social Sciences - text analysis

Ben Schmidt Book Humanities Course

TidyTuesday screencasts on tidytext

  1. ✔️ Medium articles - 2018-12-04: common words, pairwise correlations
  2. TidyTuesday Tweets - 2019-01-07
  3. Wine Ratings - 2019-05-31: Lasso regression | sentiment lexicon
  4. Simpsons Guest Stars - 2019-08-30: geom_histogram
  5. Horror Movies - 2019-10-22: explaining the glmnet package | Lasso regression
  6. The Office - 2020-03-16: geom_text_repel from ggrepel | glmnet package to run a cross-validated Lasso regression
  7. Animal Crossing - 2020-05-05: using geom_line and geom_point to graph ratings over time | geom_text to visualize which words are associated with positive/negative reviews | topic modelling

References

Kaye, Ella. 2019. “Working with Text in R.” October. https://ellakaye.rbind.io/talks/2019-10-05-working-with-text-in-r/.
LDAL. 2022. “Tutorials.” https://ladal.edu.au/tutorials.html#5_Text_Analytics.
Robinson, David. 2017. “Words Growing or Shrinking in Hacker News Titles: A Tidy Analysis.” Variance Explained. June 8, 2017. http://varianceexplained.org/r/hn-trends/.
Robinson, David, and Julia Silge. 2022. Text Mining with R. https://www.tidytextmining.com/.